This article presents a complete scheme for the development of Critical Embedded Systems with
Multiple Real-Time Constraints. The system is programmed with a language that extends the
synchronous approach with high-level real-time primitives. It makes it possible to assemble, in a
modular and hierarchical manner, several locally mono-periodic synchronous systems into a globally
multi-periodic synchronous system. It also allows flow latency constraints to be specified. A program
is translated into a set of real-time tasks. The generated C code can be executed on a simple
real-time platform with a dynamic-priority scheduler (EDF). The compilation process (each algorithm
of the process, not the compiler itself) is formally proved correct, meaning that the generated code
respects the real-time semantics of the original program (periods, deadlines, release dates and
precedences) as well as its functional semantics (variable consumption).

1 Introduction

Embedded systems have successfully been implemented with synchronous languages in the past. In
particular, data-flow synchronous languages (LUSTRE/SCADE [6], SIGNAL [11]) are well adapted for
describing precisely the data flow between the communicating processes of the system. In [5] we
proposed an extension of synchronous languages to design multi-periodic systems efficiently, by
assembling several synchronous nodes into a multi-periodic synchronous program. Such an approach
makes it possible to describe the real-time aspects and the functional aspects of a system in the
same framework. The purpose of the paper is to give an overview of the language capabilities and to
describe the compilation chain through the programming of a case study. The generated code targets a
simple real-time platform with the earliest-deadline-first (EDF) scheduling policy [9]. We present
the whole compilation chain but do not go into the details of the proofs, which can be found in [4].
We focus more particularly on the generated code, which gives a concrete illustration of the
compilation and summarizes our contribution.



1.1 Motivation 

The development of an industrial critical system may involve several teams, which separately define the 
different functions of the system. The functions are then assembled by the integrator, who implements 
the communications between the functions. Currently, this integration process lacks a formal language 
to ease the design process and to ensure the correctness of the global system. 
We consider the simplified Flight Control System of Fig. 1 as a case study. This system controls the
attitude, the trajectory and the speed of an airplane in auto-pilot mode. It consists of three
communicating sub-systems. Each sub-system consists of several operations (represented by boxes in
the figure) and executes repeatedly at a periodic rate. The fastest sub-system executes every 10ms:
it acquires the state of the system (angles, position, acceleration) and computes the feedback law
of the system. The intermediate sub-system is the piloting loop: it executes every 40ms and manages
the flight control surfaces of the airplane. The slowest sub-system is the navigation loop: it
executes every 120ms and determines the acceleration to apply. The required position of the airplane
is acquired at the slow rate.

Figure 1: Flight control system



The three sub-systems are first defined separately by different teams. The integrator then assembles 
them in the global system and specifies how they communicate. The language focuses on this assembly 
level. 



1.2 Contribution 

The main novelties of our approach are: first the integrator can develop the complete system in a unified 
formal framework (a high-level formal language) and second the language along with its compiler covers 
the development of the system from its design to its implementation, through automated code generation. 

This relies on two different research domains. On the one hand, scheduling theory focuses mainly on
satisfying system real-time constraints but usually disregards system functional behaviour. This
ensures the correctness of the temporal properties of the system, but makes it hard to verify the
functional correctness of the system. In the case of multi-periodic systems, this often leads to
non-deterministic process communications. On the other hand, synchronous languages focus on the
correctness of the functional behaviour of the system and ensure that it is deterministic. However,
classic synchronous languages abstract from real-time (except some recent extensions discussed in
Sect. 7), which makes them ill-adapted to the implementation of multi-periodic systems.

Our work combines scheduling theory and synchronous languages to ensure both the functional and
the temporal correctness of multi-periodic systems. This is particularly suitable for the
implementation of critical systems. The integrator programs the system with the language introduced
in [5], which extends synchronous languages with high-level real-time primitives. The compiler then
generates the set of real-time tasks corresponding to this program. Tasks are then translated into C
threads, each one containing the functional code of a task completed with a deterministic
data-exchange protocol that does not require synchronization primitives (such as semaphores). The
threads are scheduled concurrently with the EDF policy. They can be preempted by the scheduler
during their execution but preemptions do not jeopardize the functional correctness of the system.
The use of an EDF scheduler departs from the classic compilation scheme of synchronous languages,
which translates a program into a "single-loop" sequential code Q. The single-loop scheme relies on
a static-priority non-preemptive scheduling policy, which makes it ill-adapted for implementing
multi-periodic processes. A dynamic-priority preemptive policy like EDF achieves better processor
utilization (it can execute more time-consuming processes). The complete compilation process has
been implemented in OCaml in about 3000 lines of code. It generates C code with calls to the
real-time primitives defined in the real-time extensions of POSIX [13].



1.3 Paper Outline 

Sect. 2 gives an overview of the language for specifying multi-periodic systems and shows how the
case study can be programmed. We then detail the compilation process. The correctness of the system
is first verified by a series of static analyses (Sect. 3). We then translate the program into a set
of real-time tasks (Sect. 4). The preservation of the synchronous semantics during inter-task
communications is ensured by a buffering protocol described in Sect. 5. We can then translate the
tasks into C code for a simple real-time platform (Sect. 6). A comparison with related works is
given in Sect. 7.



2 A Synchronous Real-Time Language 
2.1 Informal Presentation 

We present the language through the programming of the Flight Control System of Fig. 1. The different
operations of the Flight Control System are first declared as imported nodes, named after the initials of
the operations (for instance, PA stands for "Position Acquisition"):

imported node PA( i : int) returns (o: int) wcet 1; 
imported node AA( i : int) returns (o: int) wcet 1; 

The declaration of an imported node specifies the inputs and outputs of the node with their types and the 
worst case execution time (wcet) of the node. For instance, the node PA has one input i of type int, one 
output o of type int and its wcet is 1 . 

For each sub-system, we define an intermediate node that groups the operations of the sub-system.
Node definitions are modular and hierarchical. There are several ways to decompose the Flight
Control System into nodes and it is also possible to program the whole system as a single node.
However, the different decompositions produce the same behaviour as intermediate nodes are flattened
during the compilation (see Sect. 4.1.1). We choose to group operations by sub-systems (and by
rates) for better readability. In the following, the suffix _o stands for "observed", _r for
"required" and _i for "intermediate". The node for the acquisition loop is defined as follows:

node acquisition (angle, pos, acc) returns (pos_i, acc_i, angle_r)
let
  pos_i = PA(pos);
  acc_i = AA(acc);
  angle_r = FL(angle);
tel

J. Forget, F. Boniol, D. Lesens & C. Pagetti

This node has three inputs and three outputs, the types of which are left unspecified and will be
inferred by the type-checker of the language (see Sect. 3). The body of the node (the let ... tel
block) is a set of equations that define how the outputs of the node are computed from its inputs.
All the variables and expressions of a program are flows. For example, a constant value stands for
an infinite constant sequence. Nodes are applied point-wise to their arguments. So, for instance, at
each repetition of node acquisition, the output pos_i is obtained by applying node PA to input pos.
Similarly, we define a node for the piloting loop and for the navigation loop:

node piloting (angle_r, acc_i, acc_r) returns (order)
var acc_o;
let
  acc_o = PF(acc_i);
  order = PL(angle_r, acc_o, acc_r);
tel

node navigation (pos_i, pos_r) returns (acc_r)
var pos_o;
let
  pos_o = NF(pos_i);
  acc_r = NL(pos_o, pos_r);
tel

So far, each node could be defined with the existing LUSTRE language as each sub-system is
mono-periodic. For the main node FCS however, we use new primitives to handle rate transitions (when
operations of different rates communicate) and to specify the real-time constraints of the different
operations:

node FCS (pos_r: rate (120, 0); angle, pos, acc) returns (order: due 15)
var acc_i, acc_r, angle_r, pos_i;
let
  acc_r = navigation(pos_i/^12, pos_r);
  order = piloting(angle_r/^4, acc_i/^4, (0 fby acc_r)*^3);
  (pos_i, acc_i, angle_r) = acquisition(angle, pos, acc);
tel

When a slower node consumes a flow produced by a faster node, we under-sample the flow using
operator /^: e/^k only keeps the first value out of each k successive values of e. For instance flow
acc_i is under-sampled by factor 4 as its consumer (piloting) is 4 times slower than its producer
(acquisition).

For communications from slow to fast operations, we first delay the flow with operator fby. The
operator fby inserts a unitary delay: expression cst fby e produces the value cst at its first
iteration and then the previous values of e (i.e. e delayed by the period of e). We then over-sample
the delayed flow with operator *^: e*^k over-samples e by a factor k. Each value of e is duplicated
k times in the result. For instance the flow acc_r is delayed and then over-sampled by a factor 3 as
its consumer (piloting) is 3 times faster than its producer (navigation). We use a delay before
over-sampling the flow to avoid reducing the deadline for the production of the flow. In the case of
flow acc_r for instance, without a delay the deadline for NL would be lower than 40.
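The behaviour of the three operators can be sketched on finite prefixes of flows. The snippet below is only an illustration in Python, not part of the language or of its compiler: flows are modelled as plain lists, the function names are hypothetical, and the operators are written /^ and *^ in the comments. The sample values are invented.

```python
def under_sample(e, k):
    """e /^ k: keep the first value out of each k successive values of e."""
    return e[::k]

def over_sample(e, k):
    """e *^ k: duplicate each value of e k times."""
    return [v for v in e for _ in range(k)]

def fby(cst, e):
    """cst fby e: cst at the first iteration, then the previous values of e."""
    return [cst] + e[:-1]

# Hypothetical sample values, chosen only for the illustration.
acc_i = [0, 1, 2, 3, 4, 5, 6, 7]   # produced at the fast rate (10 ms)
acc_r = [10, 20, 30]               # produced at the slow rate (120 ms)

print(under_sample(acc_i, 4))          # [0, 4]
print(over_sample(fby(0, acc_r), 3))   # [0, 0, 0, 10, 10, 10, 20, 20, 20]
```

The second line mirrors the expression (0 fby acc_r)*^3 of node FCS: the first three repetitions of the consumer see the initial value 0, the next three see the first value of acc_r, and so on.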

For now, we have only described the ratio between the execution rates of the nodes. The declaration
of the node inputs simply specifies that pos_r has clock (120,0) (i.e. that it has period 120 and
phase 0) and all the different rates of the system are deduced from this information by the clock
calculus (see Sect. 3). The declaration of output order imposes a deadline constraint (due 15),
which requires it to be produced less than 15ms after the beginning of its period, to respect some
external environment constraint. Its period is left unspecified (its inferred value is 40ms). The
behaviour of the new operators is illustrated in Fig. 2, in which we give the value of each
expression at each repetition of the system.

Figure 2: Behaviour of real-time operators



2.2 Formal Definition: Strictly Periodic Clocks 

In the synchronous approach, the computations performed by the system are split into a succession of 
instants, where each instant corresponds to one repetition of the system. The synchronous assumption 
requires that for each instant, computations end before the end of the instant. Computations can be 
activated or deactivated at different instants using clocks. Clocks define the temporal behaviour of the 
program on the logical time scale of instants. 

To define formally the real-time operators presented in the previous section, we introduce a new
class of clocks called strictly periodic clocks. Given a set of values $\mathcal{V}$, a flow is a
sequence of pairs $(v_i, t_i)_{i \in \mathbb{N}}$ where $v_i$ is a value in $\mathcal{V}$ and $t_i$
is a tag in $\mathbb{N}$, such that for all $i$, $t_i < t_{i+1}$. The clock of a flow is its
projection on $\mathbb{N}$. A tag represents an amount of time elapsed since the beginning of the
execution of the program. Each value of a flow must be computed before its next activation: $v_i$
must be produced during the time interval $[t_i, t_{i+1}[$. After precedence encoding, the deadline
may actually be less than $t_{i+1}$; this will be detailed in Sect. 4.3.3.

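Definition 1. (Strictly periodic clock). A clock $h = (t_i)_{i \in \mathbb{N}}$, $t_i \in \mathbb{N}$,
is strictly periodic if and only if:
$\exists n \in \mathbb{N}^*, \forall i \in \mathbb{N}, t_{i+1} - t_i = n$.

$n$ is the period of $h$, denoted $\pi(h)$ and $t_0$ is the phase of $h$, denoted $\varphi(h)$.

Definition 2. The term $(n,p) \in \mathbb{N}^* \times \mathbb{Q}^+$ denotes the strictly periodic
clock $\alpha$ such that $\pi(\alpha) = n$ and $\varphi(\alpha) = \pi(\alpha) * p$.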

A strictly periodic clock defines the real-time rate of a flow and is uniquely characterized by its phase 
and by its period. Strictly periodic clocks relate logical time (instants) to real-time. Locally, each flow 
has its own notion of instant (it must end before its next activation), and globally we can compare flows 
that do not share the same notion of instants by relating instants to real-time. We introduce periodic clock 
transformations to formalize such rate transitions: 

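Definition 3. Let $\alpha$ be a strictly periodic clock. Operations $/_.$, $*_.$ and $\rightarrow_.$
are periodic clock transformations, that produce new strictly periodic clocks satisfying the
following properties:

• $\pi(\alpha /_. k) = k * \pi(\alpha)$, $\varphi(\alpha /_. k) = \varphi(\alpha)$, $k \in \mathbb{N}^*$

• $\pi(\alpha *_. k) = \pi(\alpha)/k$, $\varphi(\alpha *_. k) = \varphi(\alpha)$, $k \in \mathbb{N}^*$

• $\pi(\alpha \rightarrow_. q) = \pi(\alpha)$, $\varphi(\alpha \rightarrow_. q) = \varphi(\alpha) + q * \pi(\alpha)$, $q \in \mathbb{Q}$

Rate transition operators apply periodic clock transformations to flows. If e has clock $\alpha$,
then e/^k has clock $\alpha /_. k$, e*^k has clock $\alpha *_. k$ and e ~> q has clock
$\alpha \rightarrow_. q$.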
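As an illustration of Definitions 1 to 3, the sketch below models a strictly periodic clock as a (period, phase) pair and implements the three periodic clock transformations. It is a hedged Python model, not the compiler's data structure: the names Clock, clock, divide, multiply, shift and tags are hypothetical, and rejecting a multiplication whose resulting period is not an integer is an assumption made here to keep periods in N*.

```python
from fractions import Fraction
from typing import NamedTuple

class Clock(NamedTuple):
    period: Fraction   # pi(alpha)
    phase: Fraction    # phi(alpha)

def clock(n, p=0):
    """Term (n, p) of Definition 2: period n and phase n * p."""
    return Clock(Fraction(n), Fraction(n) * Fraction(p))

def divide(a, k):
    """Clock division: period multiplied by k, phase unchanged."""
    return Clock(a.period * k, a.phase)

def multiply(a, k):
    """Clock multiplication: period divided by k, phase unchanged."""
    if a.period % k != 0:
        # Assumption of this sketch: the resulting period must stay an integer.
        raise ValueError("period is not divisible by k")
    return Clock(a.period / k, a.phase)

def shift(a, q):
    """Phase offset: period unchanged, phase increased by q * period."""
    return Clock(a.period, a.phase + Fraction(q) * a.period)

def tags(a, count):
    """First `count` tags of the clock (Definition 1)."""
    return [a.phase + i * a.period for i in range(count)]

print(divide(clock(10), 4))     # period 40, phase 0: clock of acc_i/^4
print(multiply(clock(120), 3))  # period 40, phase 0: clock of (0 fby acc_r)*^3
print(tags(shift(clock(10), Fraction(1, 2)), 3))  # tags 5, 15, 25
```

Both rate transitions used in node FCS end up on the clock (40,0), which is why the three arguments of piloting can be combined.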
2.3 Syntax

The syntax of the language is close to LUSTRE. It is extended with real-time primitives based on
strictly periodic clocks. The grammar of the language is given in Fig. 3. A program consists of a
list of node declarations (nd). Nodes can either be defined in the program (node) or implemented
outside (imported node), for instance by a C function. A duration must be provided for each imported
node, more precisely an upper bound on its worst case execution time (wcet). The external code
provided for imported nodes can itself be generated by a standard synchronous language compiler
(like LUSTRE), in case developers want to program the whole system using synchronous languages. The
clock of the input/output parameters (in/out) of a node can be declared strictly periodic
(x: rate(n,p), x then has clock (n,p)) or left unspecified (x). A deadline constraint can be imposed
on outputs (x: rate(n,p) due n', the deadline is n'). The body of a node consists of an optional
list of local variables (var) and a list of equations (eq). Each equation defines the value of one
or several variables using an expression on flows (var = e). Expressions may be immediate constants
(cst), variables (x), pairs ((e,e)), initialised delays (cst fby e), applications (N(e)) or
expressions using strictly periodic clocks (epck). Values k, n, n', q must be statically evaluable.
Value q must be an element of Q+.



cst  ::= true | false | 0 | ...
var  ::= x | var, var
e    ::= cst | x | (e,e) | cst fby e | N(e) | epck
epck ::= e /^ k | e *^ k | e ~> q
eq   ::= var = e | eq; eq
in   ::= x: rate(n,p) | x | in; in
out  ::= x: rate(n,p) | x: rate(n,p) due n' | x | out; out
nd   ::= node N(in) returns (out) [var var;] let eq tel
       | imported node N(in) returns (out) wcet n;



Figure 3: Language grammar 



3 Static Analyses 

Synchronous languages are targeted for critical systems. Therefore, the compilation process puts strong 
emphasis on the verification of the correctness of the programs to compile. This consists of a series of 
static analyses of the program, which are performed before code generation. 

The first analysis performed by the compiler is the type-checking. The language is a strongly typed
language, in the sense that the execution of a program cannot produce a run-time type error. Each flow
has a single, well-defined type and only flows of the same type can be combined. The type-checking
of the language is fairly standard [12]. For the example of the Flight Control System, the type-checker
produces the following type for node FCS: (int*int*int*int)->int. This means that the node takes
four integer inputs and produces one integer output. This type is inferred from the types of the imported
nodes.

The causality check verifies that the program does not contain cyclic definitions: a variable cannot
instantaneously depend on itself (i.e. not without a fby in the dependencies). For instance, the
equation x=x+1; is incorrect. It is similar to a deadlock since we need to evaluate x+1 to evaluate
x and we need to evaluate x to evaluate x+1.
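A causality check of this kind can be sketched as cycle detection on the graph of instantaneous dependencies. The Python function below is an illustration only and does not reproduce the compiler's implementation: the name find_instantaneous_cycle and the dictionary encoding are assumptions. Each variable maps to the variables it reads without crossing a fby, so delayed dependencies are simply left out of the dictionary.

```python
def find_instantaneous_cycle(deps):
    """Return a cyclic chain of variables, or None if the program is causal.

    deps maps each variable to the variables it depends on instantaneously
    (dependencies that go through a fby are omitted by the caller).
    """
    state = {}  # variable -> "active" (on the DFS stack) or "done"

    def visit(v, path):
        state[v] = "active"
        for u in deps.get(v, ()):
            if state.get(u) == "active":
                # u is already on the current path: we closed a cycle.
                return path[path.index(u):] + [u]
            if u not in state:
                cycle = visit(u, path + [u])
                if cycle:
                    return cycle
        state[v] = "done"
        return None

    for v in deps:
        if v not in state:
            cycle = visit(v, [v])
            if cycle:
                return cycle
    return None

# x = x + 1;          x depends instantaneously on itself: rejected
print(find_instantaneous_cycle({"x": ["x"]}))          # ['x', 'x']
# x = 0 fby (x + 1);  the only dependency is delayed: accepted
print(find_instantaneous_cycle({"x": []}))             # None
```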

The clock calculus (defined in [5]) verifies that a program only combines flows that have the same
clock. When two flows have the same clock, they are synchronous as they are always present at the
same instants. Combining non-synchronous flows leads to non-deterministic programs, as it accesses
undefined values. For instance we can only compute the sum of two synchronous flows, because the
value of the sum is ill-defined when exactly one of the two flows is absent (when both flows are
absent the sum flow is simply absent, which is well-defined). The clock calculus ensures that a
synchronous program never accesses undefined values. For the example of the Flight Control System,
the clock calculus produces the following clock for node FCS:
((120,0)*(10,0)*(10,0)*(10,0))->(40,0).
This means that inputs angle, pos and acc have period 10, while input pos_r has period 120.
The output order has period 40 (though its deadline is 15). These static analyses ensure that a
program accepted by the compiler has a deterministic behaviour.



4 Translation into Real-Time Tasks 

This section details how the compilation process translates the input program into a set of real-time tasks. 
We first extract a task graph from the program, where tasks are related by precedence constraints. We 
then encode task precedences in task real-time attributes to obtain a set of independent tasks. 

4.1 Task Graph Extraction 
4.1.1 Tasks 

A synchronous program consists of a hierarchy of nodes, the leaves of which are predefined or
imported nodes. The task graph generation process first inlines intermediate nodes appearing in the
main node recursively, replacing each intermediate node call by its set of equations. For instance,
the program of the Flight Control System of Sect. 2.1 is translated into a single node (the main
node FCS) containing one node call to each imported node, PA, AA, FL, PF, PL, NF, NL, one node call
to operator *^, one node call to operator fby and three node calls to operator /^.

This "flattened" main node is then translated into a task graph. Each imported node call is
translated into a task. Each variable of the node and each predefined operator call is also
translated into a vertex but will later be reduced to simplify the graph (see Sect. 4.1.3). The
clustering of several nodes into the same task to reduce the number of generated tasks is beyond the
scope of this paper. We could probably reuse existing strategies, for instance those suggested in
[3].



4.1.2 Task Precedences 

In order to respect the synchronous semantics, for each data-dependency there must be a precedence
from the task producing the data to the task consuming it. Task precedences are deduced from data
dependencies between expressions of the program. Similarly to [1], we say that an expression e'
precedes an expression e when e syntactically depends on e'. This occurs either when e' appears in e
or when x appears in e and we have an equation x = e'. Let g = (V,E) denote a task graph, where V is
the set of tasks of the graph and E is the set of task precedences of the graph (a subset of V x V).
For instance, the flattened Flight Control program contains the two equations
pos_o=NF(pos_i/^12); pos_i=PA(pos);.
These equations produce the following graph: ({NF, PA, /^12, pos, pos_i, pos_o},
{pos -> PA, PA -> pos_i, pos_i -> /^12, /^12 -> NF, NF -> pos_o}).

4.1.3 Task Graph Reduction 

We then simplify the intermediate graph structure. First, each input of the main node is translated
into a (sensor) task and each output of the main node is translated into an (actuator) task. Second,
we remove variables from the graph, replacing recursively each pair of precedences
$N \rightarrow x \rightarrow M$, where $x$ is a local variable, by a single precedence
$N \rightarrow M$.

We finally translate predefined nodes into precedence annotations. A precedence
$\tau_i \xrightarrow{ops} \tau_j$ represents an extended precedence, where $ops$ is a list of
precedence annotations $op$, with $op \in \{\texttt{*\^{}}k, \texttt{/\^{}}k, \texttt{\~{}>}q, \texttt{fby}\}$,
$op.ops$ denotes the list whose head is $op$ and whose tail is $ops$ and $\epsilon$ denotes the
empty list. For instance, the precedences PA -> pos_i, pos_i -> /^12, /^12 -> NF are simplified into
a single extended precedence $PA \xrightarrow{\texttt{/\^{}12}} NF$. When the rewriting terminates,
every task of the graph corresponds to either an imported node or a sensor or an actuator. The
reduced task graph of the Flight Control System is given in Fig. 4.
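The reduction can be sketched as a traversal that starts from each task and follows variable and operator vertices until the next task, collecting the operators met on the way. The Python code below is only a sketch under stated assumptions: the function name reduce_graph, the vertex identifier op1 and the encoding of operators as strings are hypothetical, and the intermediate graph is assumed to contain no cycle made only of variables and operators.

```python
def reduce_graph(edges, tasks, operators):
    """Collapse variable and operator vertices between tasks.

    edges     : list of (source, destination) vertices of the intermediate graph
    tasks     : set of vertices kept as tasks (imported nodes, sensors, actuators)
    operators : dict mapping an operator vertex to its precedence annotation
    Returns a set of extended precedences (producer, consumer, annotations).
    """
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, []).append(dst)

    def reach(vertex, ops):
        # Follow the successors of `vertex` until a task is reached.
        for nxt in succ.get(vertex, []):
            if nxt in tasks:
                yield nxt, tuple(ops)
            elif nxt in operators:
                yield from reach(nxt, ops + [operators[nxt]])
            else:  # plain variable: skipped without annotation
                yield from reach(nxt, ops)

    return {(src, dst, ops) for src in tasks for dst, ops in reach(src, [])}

# Intermediate graph of the equations pos_o=NF(pos_i/^12); pos_i=PA(pos);
edges = [("pos", "PA"), ("PA", "pos_i"), ("pos_i", "op1"),
         ("op1", "NF"), ("NF", "pos_o")]
tasks = {"pos", "PA", "NF"}       # pos is an input of the main node (sensor)
operators = {"op1": "/^12"}       # op1 is a hypothetical vertex identifier

reduced = reduce_graph(edges, tasks, operators)
print(sorted(reduced))  # [('PA', 'NF', ('/^12',)), ('pos', 'PA', ())]
```

In this fragment pos_o has no consumer, so it produces no precedence; in the complete program it would lead to NL.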



Figure 4: Reduced task graph for the Flight Control System program



4.2 Real-Time Characteristics Extraction 

Each task $\tau_i$ of the graph is characterized by its real-time attributes $(T_i, C_i, r_i, d_i)$.
$\tau_i$ is instantiated periodically with period $T_i$. It cannot start its execution before all
its predecessors, defined by the precedence constraints, complete their execution. $C_i$ is the
(worst case) execution time of the task and $r_i$ is the release date of the first instance of the
task. The subsequent release dates are $r_i + T_i$, $r_i + 2T_i$, etc. $d_i$ is the relative
deadline of the task. The absolute deadline $D_i[j]$ of the instance $j$ of a task $\tau_i$ is the
release date $R_i[j]$ of this instance plus the relative deadline of the task:
$D_i[j] = R_i[j] + d_i$. Task real-time characteristics are extracted as follows:

• Periods: The period of a task is obtained from its clock $ck_i$. We have $T_i = \pi(ck_i)$.

• Deadlines: By default, the deadline of a task is its period ($d_i = T_i$). Deadline constraints
can also be specified on the production of a node output (o : due n).

• Release Dates: The initial release date of a task is the phase of its clock: $r_i = \varphi(ck_i)$.

• Execution Times: The execution time $C_i$ of a task is directly specified by the wcet of the
imported node declaration. For simplification, we consider that the run-time overhead due to task
preemptions is negligible.
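The task model can be made concrete with a small record type. The sketch below is an illustration in Python with hypothetical names (Task, release_date, absolute_deadline); it only restates the formulas above, with the release date of instance j equal to r + j*T and its absolute deadline equal to that release date plus d. The values used for PL are those of the case study (period 40, wcet 6, phase 0) with the relative deadline 15 inherited from the constraint on output order, assuming a null duration for the actuator.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    name: str
    period: int     # T_i, from the period of the task clock
    wcet: int       # C_i, from the wcet of the imported node
    release: int    # r_i, from the phase of the task clock
    deadline: int   # d_i, the period by default or a "due" constraint

    def release_date(self, j):
        """Release date of instance j."""
        return self.release + j * self.period

    def absolute_deadline(self, j):
        """Absolute deadline of instance j."""
        return self.release_date(j) + self.deadline

pl = Task("PL", period=40, wcet=6, release=0, deadline=15)
print(pl.release_date(2), pl.absolute_deadline(2))  # 80 95
```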

4.3 Precedence Encoding 

4.3.1 Simple Precedences Encoding 

[2] showed that a set of dependent tasks (tasks related by precedence constraints) can be reduced to
a set of independent ones (without precedences), yielding an equivalent problem under the EDF
policy, by adjusting task release dates and deadlines such that precedences are encoded in the
adjusted real-time characteristics. The adjusted absolute deadline $D_i^*$ of a task $\tau_i$ is:

$$D_i^* = \min\Big(D_i, \min_{\tau_j \in succ(\tau_i)} (D_j^* - C_j)\Big)$$

If we want to perform a schedulability test, the adjusted release date of a task is:

$$R_i^* = \max\Big(R_i, \max_{\tau_j \in pred(\tau_i)} (R_j^* + C_j)\Big)$$

If we only want to schedule the program correctly, the adjusted release date $R_i^{*\prime}$ of a
task is:

$$R_i^{*\prime} = \max\Big(R_i, \max_{\tau_j \in pred(\tau_i)} R_j^{*\prime}\Big)$$

For simplification, we only consider the second encoding in the following.
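For simple precedences, the encoding amounts to one forward pass for release dates and one backward pass for deadlines over a topological order of the task graph. The Python sketch below illustrates this under stated assumptions: the function name encode_precedences and the tuple layout (R, C, D) are hypothetical, the graph is assumed acyclic, and the release dates follow the second encoding (without adding execution times). The example uses a single instance of AA, PF and PL with the execution times of the case study and treats the under-sampled precedence from AA to PF as a simple one for its first instance.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def encode_precedences(tasks, edges):
    """tasks: name -> (R, C, D); edges: iterable of (predecessor, successor)."""
    preds = {t: set() for t in tasks}
    succs = {t: set() for t in tasks}
    for a, b in edges:
        preds[b].add(a)
        succs[a].add(b)

    # static_order() lists every task after all of its predecessors.
    order = list(TopologicalSorter(preds).static_order())

    release = {}
    for t in order:  # forward pass: R*' = max(R, max over predecessors of R*')
        release[t] = max([tasks[t][0]] + [release[p] for p in preds[t]])

    deadline = {}
    for t in reversed(order):  # backward pass: D* = min(D, min over succ of D* - C)
        deadline[t] = min([tasks[t][2]] +
                          [deadline[s] - tasks[s][1] for s in succs[t]])
    return release, deadline

tasks = {"AA": (0, 1, 10), "PF": (0, 4, 40), "PL": (0, 6, 15)}
release, deadline = encode_precedences(tasks, [("AA", "PF"), ("PF", "PL")])
print(deadline)  # AA -> 5, PF -> 9, PL -> 15
```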

4.3.2 Extended Precedences Encoding 

Fig. 5 shows that we can unfold extended precedences between tasks into simple precedences between
task instances. The precedence encoding technique can then be applied to the "unfolded" graph.

Figure 5: Encoding extended precedences



More formally, let $\tau_i[n] \rightarrow \tau_j[n']$ denote a precedence from task instance
$\tau_i[n]$ to task instance $\tau_j[n']$. From the semantics of predefined operators, we have
$\tau_i \xrightarrow{ops} \tau_j \Rightarrow \forall n, \tau_i[n] \rightarrow \tau_j[g_{ops}(n)]$,
with $g_{ops}$ defined as follows:

$$g_{\texttt{*\^{}}k.ops}(n) = g_{ops}(kn) \qquad g_{\texttt{/\^{}}k.ops}(n) = g_{ops}(\lceil n/k \rceil)$$
$$g_{\texttt{\~{}>}q.ops}(n) = g_{ops}(n) \qquad g_{\texttt{fby}.ops}(n) = g_{ops}(n+1)$$
$$g_{\epsilon}(n) = n$$

The precedence relation is an over-approximation of the data-dependency relation. Indeed, there is a
data dependency between $\tau_i[n]$ and $\tau_j[n']$, meaning that $\tau_j[n']$ consumes the data
produced by $\tau_i[n]$, if and only if
$\tau_i \xrightarrow{ops} \tau_j \wedge n' = g_{ops}(n) \wedge g_{ops}(n) \neq g_{ops}(n+1)$.
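The instance mapping and the data-dependency test can be sketched directly from these definitions. In the Python illustration below, an annotation list is encoded as a list of (operator, argument) pairs, with the operators written "*^", "/^", "~>" and "fby"; this encoding and the names g_ops and has_data_dependency are assumptions of the sketch. The under-sampling case uses the ceiling of n/k, which is the reading under which the producer instances 0, k, 2k, ... are the ones whose data is consumed.

```python
def g_ops(ops, n):
    """Index of the consumer instance that must follow producer instance n."""
    for op, arg in ops:          # apply the head annotation first, then the tail
        if op == "*^":
            n = arg * n
        elif op == "/^":
            n = -(-n // arg)     # ceiling of n / arg on integers
        elif op == "fby":
            n = n + 1
        elif op == "~>":
            pass                 # a phase offset does not change instance indices
        else:
            raise ValueError(f"unknown annotation {op!r}")
    return n

def has_data_dependency(ops, n, n_prime):
    """True iff consumer instance n_prime consumes the data of producer instance n."""
    return n_prime == g_ops(ops, n) and g_ops(ops, n) != g_ops(ops, n + 1)

under4 = [("/^", 4)]                 # e.g. from FL to PL
delayed = [("fby", None), ("*^", 3)]  # e.g. from NL to PL

print([g_ops(under4, n) for n in range(6)])   # [0, 1, 1, 1, 1, 2]
print(g_ops(delayed, 0))                      # 3
print(has_data_dependency(under4, 4, 1))      # True
print(has_data_dependency(under4, 1, 1))      # False
```

With the delayed precedence, producer instance 0 precedes consumer instance 3: the first three consumer instances read the initial value of the fby.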

We can then adapt the encoding to our context. For each precedence
$\tau_i \xrightarrow{ops} \tau_j$, we must adjust the release dates and deadlines of each instance
$\tau_i[n]$ such that $R_i^*[n] \leq R_j^*[g_{ops}(n)]$ and
$D_i^*[n] \leq D_j^*[g_{ops}(n)] - C_j$. Concerning release dates, we can easily prove that thanks
to the synchronous semantics we already have $R_i[n] \leq R_j[g_{ops}(n)]$, so release dates do not
need to be adjusted. Concerning deadlines, we need to transpose the formulae to relative deadlines
to fit our task model. From the definition of relative deadlines:
$D_i^*[n] \leq D_j^*[g_{ops}(n)] - C_j \Leftrightarrow d_i[n] \leq d_j[g_{ops}(n)] + r_j + g_{ops}(n)T_j - r_i - nT_i - C_j$.

4.3.3 Deadline Calculus 

In practice we do not need to unfold extended precedences to perform their encoding. Instead, we
represent the sequence of deadlines of the instances of a task as a finite repetitive pattern called
deadline word. A unitary deadline specifies the relative deadline for the computation of a single
instance of a task. It is simply an integer value $d$. A deadline word defines the sequence of
unitary deadlines for each instance of a task. The set of deadline words is defined by the following
grammar: $w ::= (u)^\omega$, $u ::= d \mid d.u$. Term $(u)^\omega$ denotes the infinite repetition
of word $u$. In the following, $w[n]$ denotes the $n^{th}$ unitary deadline of deadline word $w$
($n \in \mathbb{N}$).

Let $w_i$ denote the deadline word of task $\tau_i$. A precedence $\tau_i \xrightarrow{ops} \tau_j$
is encoded by a constraint relating $w_i$ to $w_j$ of the form:

$$w_i \leq W_{ops}(w_j) + \Delta_{ops}(T_i, T_j) - C_j + r_j - r_i$$

where, for all $n$, $W_{ops}(w_j)[n] = w_j[g_{ops}(n)]$ and
$\Delta_{ops}(T_i, T_j)[n] = g_{ops}(n)T_j - nT_i$.
Let $\mathcal{W}_{ops}(\tau_j) = W_{ops}(w_j) + \Delta_{ops}(T_i, T_j) - C_j + r_j - r_i$.

Property 1. $\Delta_{ops}(T_i, T_j)$ is periodic and can be represented as a deadline word. The set
of deadline words is closed under operation $W_{ops}$ and under deadline word addition. As a
consequence, $\mathcal{W}_{ops}(\tau_j)$ is a deadline word.

Proof. By induction. □

Property 2. The deadline words of a task graph $g = (V,E)$ can be computed with complexity
$\mathcal{O}(|V| + |E| * |w_{max}|)$, where $w_{max}$ denotes the longest deadline word in the task
graph.

Proof. A reduced task graph is a DAG (when we do not consider delayed precedences), so we can
compute the deadline words of the graph by performing a topological sort working backwards (starting
from the end of the graph). As the complexity of a topological sort is $\mathcal{O}(|V| + |E|)$, the
complexity of the algorithm is $\mathcal{O}(|V| + |E| * |w_{max}|)$. □
For instance, for the Flight Control System program, we take $C_{PA} = 1$, $C_{AA} = 1$,
$C_{FL} = 3$, $C_{PF} = 4$, $C_{PL} = 6$, $C_{NL} = 20$, $C_{NF} = 5$. To simplify, we take null
durations for sensors and actuators. The result of the deadline calculus is: $w_{PA} = (10)^\omega$,
$w_{AA} = (5.10.10.10)^\omega$, $w_{FL} = (9.10.10.10)^\omega$, $w_{PF} = (9)^\omega$,
$w_{PL} = (15)^\omega$, $w_{NL} = (120)^\omega$, $w_{NF} = (100)^\omega$. The deadline words of
tasks AA and FL state that, for the first repetition out of each four successive repetitions, the
two tasks have a shorter deadline, as PF and PL execute. Notice that if we set deadline 5 for all
the instances of AA (instead of a deadline word), this example is not schedulable.
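The deadline words of the case study can be recomputed with a short script. The Python sketch below is a simplified illustration, not the compiler's algorithm: it unfolds each task over one hyperperiod instead of manipulating words symbolically, it assumes all phases are zero (so the term r_j - r_i vanishes), it hard-codes the precedences of the reduced task graph as read from node FCS, and it models the constraint on output order as a base deadline of 15 for PL. All names are hypothetical.

```python
from functools import reduce
from math import gcd

PERIOD = {"PA": 10, "AA": 10, "FL": 10, "PF": 40, "PL": 40, "NF": 120, "NL": 120}
WCET = {"PA": 1, "AA": 1, "FL": 3, "PF": 4, "PL": 6, "NF": 5, "NL": 20}
BASE_DEADLINE = dict(PERIOD, PL=15)   # default: the period; PL inherits "due 15"
EDGES = [("PA", "NF", [("/^", 12)]), ("NF", "NL", []),
         ("NL", "PL", [("fby", None), ("*^", 3)]),
         ("AA", "PF", [("/^", 4)]), ("PF", "PL", []), ("FL", "PL", [("/^", 4)])]
ORDER = ["PL", "NL", "PF", "NF", "FL", "AA", "PA"]  # consumers before producers

def g_ops(ops, n):
    """Consumer instance index associated with producer instance n."""
    for op, arg in ops:
        if op == "*^":
            n = arg * n
        elif op == "/^":
            n = -(-n // arg)          # ceiling division
        elif op == "fby":
            n = n + 1
    return n

def shortest_pattern(word):
    """Smallest prefix whose repetition gives the whole unfolded word."""
    for size in range(1, len(word) + 1):
        if len(word) % size == 0 and word == word[:size] * (len(word) // size):
            return word[:size]

def deadline_words():
    hyper = reduce(lambda a, b: a * b // gcd(a, b), PERIOD.values())
    words = {}
    for t in ORDER:
        word = []
        for n in range(hyper // PERIOD[t]):
            d = BASE_DEADLINE[t]
            for src, dst, ops in EDGES:
                if src == t:
                    m = g_ops(ops, n)
                    wj = words[dst]
                    # w_i[n] <= w_j[g(n)] + g(n)*T_j - n*T_i - C_j
                    d = min(d, wj[m % len(wj)] + m * PERIOD[dst]
                               - n * PERIOD[t] - WCET[dst])
            word.append(d)
        words[t] = shortest_pattern(word)
    return words

words = deadline_words()
print(words["AA"], words["FL"], words["PF"], words["NF"])
# [5, 10, 10, 10] [9, 10, 10, 10] [9] [100]
```

For example, the first instance of AA gets 9 - 4 = 5 because PF must still execute within its own deadline of 9, while the three following instances of AA keep the default deadline of 10.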



5 Communication Protocol 



As task precedences are encoded in task deadlines, inter-task communications do not require
synchronization primitives (like semaphores for instance). Indeed, as long as tasks respect their
deadlines, data is produced before being consumed, so tasks simply read from and write to some
communication buffers allocated in a global shared memory when they execute. However, to respect the
synchronous semantics, the input of a task must not change during its execution. Therefore, we
propose a communication scheme which ensures that the input of a task remains available until its
deadline.

For a precedence $\tau_i \xrightarrow{ops} \tau_j$, data produced by $\tau_i[n]$ may be consumed by
$\tau_j[g_{ops}(n)]$ after $\tau_i[n+1]$ has started. This is illustrated in Fig. 6, which shows the
schedule of two tasks related by extended precedences. Vertical lines on the time axis represent
task periods and marks represent task preemptions. An arrow from A at date t to B at date t' means
that task B may read at time t' the value produced by A at time t. In Fig. 6(a), 6(c) and 6(d), when
A[1] executes, it must not overwrite the data produced by A[0] because it is consumed by B[0] and
B[0] is not complete yet. Therefore, we need a buffer to keep the value of A[n] after A[n+1] has
started. In Fig. 6(b) the same data is consumed several times but A[1] can freely overwrite the data
produced by A[0], so no specific communication scheme is required.

Figure 6: Communications for extended precedences



The communication protocol allocates a buffer for each extended precedence of the graph. The
producer of the data writes to the buffer only if $g_{ops}(n) \neq g_{ops}(n+1)$ (see the definition
of data-dependencies in Sect. 4.3). The consumer simply reads from this buffer each time it
executes. We allocate a double-buffer when the precedence contains a fby or a ~>, to keep the
previous and the current value of the data. This communication scheme is illustrated in more detail
in Sect. 6.1. It is of course not optimal because in many cases, when we consider a set of
precedences from a single task $\tau_i$ to several tasks $\tau_j$, we can use the same communication
buffer for some of the tasks $\tau_j$. Optimization will be treated in future work; we could for
instance adapt the communication scheme proposed by [15] to our language.
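The two decisions taken by the protocol, when the producer writes and whether a double-buffer is allocated, can be sketched as follows. This Python illustration uses the same hypothetical encoding of annotation lists as (operator, argument) pairs, with "/^", "*^", "fby" and "~>" as operator names; the function names are assumptions and the code does not reproduce the generated C code.

```python
def g_ops(ops, n):
    """Consumer instance index associated with producer instance n."""
    for op, arg in ops:
        if op == "*^":
            n = arg * n
        elif op == "/^":
            n = -(-n // arg)   # ceiling division
        elif op == "fby":
            n = n + 1
        # "~>" leaves the index unchanged
    return n

def producer_writes(ops, n):
    """The producer updates the buffer at instance n only if its data is consumed."""
    return g_ops(ops, n) != g_ops(ops, n + 1)

def needs_double_buffer(ops):
    """A double-buffer keeps the previous and the current value of the data."""
    return any(op in ("fby", "~>") for op, _ in ops)

fl_to_pl = [("/^", 4)]
nl_to_pl = [("fby", None), ("*^", 3)]

print([n for n in range(12) if producer_writes(fl_to_pl, n)])  # [0, 4, 8]
print(needs_double_buffer(fl_to_pl), needs_double_buffer(nl_to_pl))  # False True
```

For the under-sampled precedence, the producer writes once every 4 instances; for the delayed and over-sampled precedence, it writes at every instance and two buffer cells are needed.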

6 Code Generation 

The compiler generates C code with calls to the real-time primitives defined in the real-time extensions
of the POSIX standard (POSIX.13 [13]). The code generation can easily be adapted to any real-time
operating system that provides dynamic priority scheduling. Each task is translated into a thread and the
threads are executed concurrently by an EDF scheduler modified to handle deadline words.

6.1 Task Code Generation 

The generated code consists of a single C file. The file starts with the declaration of one global
variable for each communication buffer. For instance, for the communication from PF to PL (acc_o
before graph expansion), we have the declaration: int PF_o_PL_i2 (named after the output of the
producer and the input of the consumer). For the communication from PL to FL, we have the
declaration: int PL_o_FL_i2[2], as there is a delay between the two tasks.

The file then contains a function for each task of the task set. The function mainly consists of an 
infinite loop, that wraps the function of the corresponding imported node with the code of the commu- 
nication protocol. One step of the loop corresponds to the execution of one instance of the task. Once 
buffer updates are complete, the function signals to the scheduler that the current task instance completed 
its execution so that it can schedule another task. 

For instance, the function for task PL is given in Fig. 7. The value returned by the external
function PL is a single integer so we can directly assign its return value to an integer variable. When the
external function returns a tuple, the output value is returned as a struct pointer in the parameters of
the function. The variable NL_o_PL_i3 is the communication buffer for precedence NL → PL. It
is an array of size 2 as the precedence contains a delay. The delay is initialised at the beginning of
the function. PL alternately reads from value 1 and from value 0 of the array, starting with value
1 (NL_o_PL_i3[(instance+1)%2]). PL then copies its output value to the communication buffer
PL_o_order for precedence PL → order. Instruction invoke_scheduler(0) signals the completion
of the task instance.

void *PL_fun(void *arg) {
    NL_o_PL_i3[1] = 0; int instance = 0;
    while (1) {
        PL_o = PL(FL_o_PL_i1, PF_o_PL_i2, NL_o_PL_i3[(instance + 1) % 2]);
        PL_o_order = PL_o;
        instance++;
        invoke_scheduler(0);
    }
}

Figure 7: Code generated for task PL

The functions for tasks FL and NL, which produce data used by PL, are given in Fig. 8. Variable
FL_o_PL_i1 is the communication buffer for precedence FL → PL, which is annotated with /^4. It is
updated once every 4 iterations (update_FL_o_PL_i1[instance%4]). For precedence NL → PL, NL
alternately copies its output value to value 0 and to value 1 of the buffer NL_o_PL_i3.

void *FL_fun(void *arg) {
    int update_FL_o_PL_i1[4] = {1, 0, 0, 0}; int instance = 0;
    while (1) {
        FL_o = FL(angle_FL_i1);
        if (update_FL_o_PL_i1[instance % 4])
            FL_o_PL_i1 = FL_o;
        instance++;
        invoke_scheduler(0);
    }
}

void *NL_fun(void *arg) {
    int instance = 0;
    while (1) {
        NL_o = NL(NF_o_NL_i1, pos_r_NL_i2);
        NL_o_PL_i3[instance % 2] = NL_o;
        instance++;
        invoke_scheduler(0);
    }
}



Figure 8: Code generated for tasks FL and NL 
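
To see why the indices used in Fig. 7 and Fig. 8 implement a one-instance delay, the following sequential sketch replays the two index computations on a two-slot buffer. It is only a model under simplifying assumptions: NL and PL have the same rate, and instance n of NL completes before instance n of PL reads. The names delay_buffer_t, delay_init, nl_write and pl_read are hypothetical and do not appear in the generated code.

```c
/* Hypothetical two-slot buffer modelling NL_o_PL_i3 from Fig. 7 and 8. */
typedef struct {
    int slot[2];
} delay_buffer_t;

/* Initialise the delay: slot 1 is the one read by the first consumer
 * instance, as in the first statement of PL_fun. */
static void delay_init(delay_buffer_t *b, int init) {
    b->slot[1] = init;
}

/* Producer (NL) instance n writes slot n % 2. */
static void nl_write(delay_buffer_t *b, int instance, int v) {
    b->slot[instance % 2] = v;
}

/* Consumer (PL) instance n reads slot (n + 1) % 2, that is the slot
 * written by producer instance n - 1, or the initial value for n = 0. */
static int pl_read(const delay_buffer_t *b, int instance) {
    return b->slot[(instance + 1) % 2];
}
```

Because the producer and the consumer of the same instance number always address different slots, the value written by NL at instance n cannot overwrite the value that PL still has to read at instance n.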

The main function then creates one thread for each task, initializes the EDF scheduler and attaches
the threads to it. For instance, the thread for task PL is created by the following function call:
pthread_create(&tPL, &attrPL, PL_fun, NULL). tPL is the thread created for this task. attrPL contains
the real-time attributes of the task. PL_fun is the function executed by the thread. The last parameter
NULL stands for the arguments of PL_fun.



6.2 Implementing EDF with Variable Deadlines 

We choose to prototype the scheduler using MaRTE Operating System [14], which was designed specifically
to ease the implementation of application-specific schedulers while remaining close to the POSIX
model. We modify the EDF scheduler provided with the OS to support deadline words. To summarize, 
the EDF scheduler is defined as a high-priority thread created by the main function of the file. The sched- 
uler thread is itself scheduled by the kernel of the OS. It becomes active only when scheduling actions 
must be taken, which is when the following scheduling events (implemented by means of signals) occur: 
the current instance of a task completes its execution or a new task instance is released. The scheduler 
then computes the most urgent task among the ready tasks (tasks released and not complete yet), resumes 
the execution of the corresponding thread where it stopped and suspends the execution of the currently 
executing thread, if any. The scheduler thread then becomes inactive until the next scheduling event. 
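
The selection step performed at each scheduling event can be sketched as follows. This is a minimal illustration of earliest-deadline-first selection over a task array, not the MaRTE OS scheduler code: the record edf_task_t and the functions timespec_cmp and edf_pick are hypothetical names introduced here, and signal handling, thread suspension and thread resumption are left out.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical per-task record; only the fields needed for selection. */
typedef struct {
    struct timespec next_deadline; /* absolute deadline of current instance */
    int ready;                     /* released and not complete yet */
} edf_task_t;

/* Returns <0, 0 or >0 when a is earlier than, equal to or later than b. */
static int timespec_cmp(const struct timespec *a, const struct timespec *b) {
    if (a->tv_sec != b->tv_sec)
        return (a->tv_sec < b->tv_sec) ? -1 : 1;
    if (a->tv_nsec != b->tv_nsec)
        return (a->tv_nsec < b->tv_nsec) ? -1 : 1;
    return 0;
}

/* Index of the ready task with the earliest absolute deadline,
 * or -1 when no task is ready. Ties keep the lowest index. */
static int edf_pick(const edf_task_t *tasks, size_t n) {
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!tasks[i].ready)
            continue;
        if (best < 0 ||
            timespec_cmp(&tasks[i].next_deadline,
                         &tasks[best].next_deadline) < 0)
            best = (int)i;
    }
    return best;
}
```

A linear scan is enough for a sketch; a scheduler with many tasks would more likely keep the ready tasks in a priority queue ordered by deadline.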

The support of deadline words requires very few modifications. We define deadline words and modify
the structure describing task real-time attributes as shown in Fig. 9. Then, we modify the function that
programs the next instance of a task when the current instance completes. For a task $\tau_i$, the
attributes of which are described by the value t_data, the release date and the deadline of its new
instance are computed as described in Fig. 10 ($D_i[n] = R_i[n] + d_i[n]$). The function
incr_timespec(t1, t2) increments t1 by t2 and the function add_timespec(t1, t2, t3) sets t1 to t2+t3.

typedef struct dword {
    struct timespec *dds;
    int wsize;
} dword_t;

typedef struct thread_data {
    struct timespec period;
    struct timespec initial_release;
    dword_t dword;
    struct timespec next_deadline;
    struct timespec next_release;
    int instance;
} thread_data_t;


Figure 9: Data type representing task real-time attributes 

dword_t dw = t_data->dword; t_data->instance++;
incr_timespec(&t_data->next_release, &t_data->period);
add_timespec(&t_data->next_deadline, &t_data->next_release,
             &(dw.dds[t_data->instance % dw.wsize]));



Figure 10: Releasing a new task instance 
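
For readers who want to experiment with Fig. 9 and Fig. 10 outside the scheduler, the sketch below combines them into a compilable unit. The data types and the three statements of Fig. 10 are taken from the figures. The bodies of incr_timespec and add_timespec are assumptions (the article only states what they compute), and the wrapper name release_next_instance is hypothetical.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

typedef struct dword {
    struct timespec *dds; /* relative deadlines, one per word position */
    int wsize;            /* length of the deadline word */
} dword_t;

typedef struct thread_data {
    struct timespec period;
    struct timespec initial_release;
    dword_t dword;
    struct timespec next_deadline;
    struct timespec next_release;
    int instance;
} thread_data_t;

/* Assumed implementation: t1 = t2 + t3, keeping tv_nsec below one second.
 * Both operands are expected to be normalised already. */
static void add_timespec(struct timespec *t1, const struct timespec *t2,
                         const struct timespec *t3) {
    struct timespec sum;
    sum.tv_sec = t2->tv_sec + t3->tv_sec;
    sum.tv_nsec = t2->tv_nsec + t3->tv_nsec;
    if (sum.tv_nsec >= NSEC_PER_SEC) {
        sum.tv_nsec -= NSEC_PER_SEC;
        sum.tv_sec += 1;
    }
    *t1 = sum;
}

/* Assumed implementation: t1 = t1 + t2. */
static void incr_timespec(struct timespec *t1, const struct timespec *t2) {
    add_timespec(t1, t1, t2);
}

/* Hypothetical wrapper around the statements of Fig. 10. */
static void release_next_instance(thread_data_t *t_data) {
    dword_t dw = t_data->dword;
    t_data->instance++;
    incr_timespec(&t_data->next_release, &t_data->period);
    add_timespec(&t_data->next_deadline, &t_data->next_release,
                 &(dw.dds[t_data->instance % dw.wsize]));
}
```

For example, with a period of 600 ms and a deadline word of two relative deadlines (300 ms, 600 ms), successive calls alternate between the two relative deadlines while the release date advances by one period each time.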



We can see that the overhead due to the support of deadline words remains very reasonable. Alto- 
gether, we modified about 20 lines of code of the original EDF scheduler, which is 300 lines of code 
long. We compiled and executed the C code generated for the Flight Control System and it behaved as 
expected. 

7 Related Works 

The language used in this article relies on a specific class of clocks to handle the multi-periodic aspects
of a system. Real-time periodic clocks have also been introduced in [3], but they do not include clock
transformations to efficiently handle rate transitions. Our rate-transition operators are also very similar
to the rate transition blocks of SIMULINK [10]. Yet, as far as we know, for models using such blocks,
the code generation tool (Real-Time Workshop) relies on a Rate-Monotonic scheduler [9] used with
semaphores to handle task communications, which is not an optimal scheduling policy for this scheduling
problem.

The scheduling of multi-rate Synchronous Data Flow (SDF) is a well studied problem (see for in- 
stance m [16] [HI]). In particular [16] studies the implementation of SDF with a dynamic scheduler
using preemption. However, though multi-rate systems are at the core of SDF graphs, SDF operations
are not periodic, they are not released periodically and their relative deadline is not their period. As a 
consequence, these results do not apply to our problem. 

[15] deals with the execution of a set of synchronous tasks, the semantics of which is very close to
our task sets, with a dynamic scheduler. However, the authors do not detail how the synchronous task set 
is obtained, for instance how it is translated from a synchronous language, and task precedences are not 
specified in the task set. 

A simple solution to the problem of scheduling tasks related by extended precedences is to unfold the
extended precedence graph on the hyperperiod of the tasks and use [2] to encode the simple precedences
of the unfolded graph. This solution replaces each task $\tau_i$ by $HP/T_i$ duplicates (where $HP$ is the
hyperperiod of the tasks) in the unfolded graph. This can lead to significant computation overhead at
execution, as the scheduler needs to make its decisions according to a task set that contains many tasks.
Our solution does not duplicate any task, so the scheduler takes less time to make its decisions.
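
The growth of the unfolded task set is easy to quantify. The sketch below computes the hyperperiod of a set of periods and the number of duplicates that unfolding would create. It is a plain arithmetic illustration: the function names are hypothetical, and the periods used in the example that follows are made up rather than taken from the case study.

```c
#include <stddef.h>

/* Greatest common divisor (Euclid). */
static unsigned long gcd_ul(unsigned long a, unsigned long b) {
    while (b != 0) {
        unsigned long r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Hyperperiod: least common multiple of all periods (all assumed > 0). */
static unsigned long hyperperiod(const unsigned long *periods, size_t n) {
    unsigned long hp = 1;
    for (size_t i = 0; i < n; i++)
        hp = hp / gcd_ul(hp, periods[i]) * periods[i];
    return hp;
}

/* Number of tasks in the unfolded graph: sum over all tasks of HP / T_i. */
static unsigned long unfolded_task_count(const unsigned long *periods,
                                         size_t n) {
    unsigned long hp = hyperperiod(periods, n);
    unsigned long count = 0;
    for (size_t i = 0; i < n; i++)
        count += hp / periods[i];
    return count;
}
```

For four tasks with periods 10, 40, 30 and 120, the hyperperiod is 120 and the unfolded graph contains 12 + 3 + 4 + 1 = 20 tasks, five times more than the original task set.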

8 Conclusion

We proposed a language for programming critical systems with multiple real-time constraints, along 
with its compiler, which automatically translates a program into a set of independent real-time tasks 
programmed in C with POSIX.13 real-time extensions. The generated code is schedulable optimally
by a slightly modified EDF scheduler and requires no synchronization primitives. Though tasks are 
scheduled concurrently and preemptions are allowed, the generated program respects the real-time and 
the functional semantics of the original program. 